September 6, 2020

Introduction

  • The data was organized into training, validation, and test sets at varying scales to suit the corresponding experiments
  • The Kaggle data set is not in the standard MNIST format
  • Functions were used to translate the images into consumable formats, such as generators:
    • image_data_generator, flow_images_from_directory, predict_generator
    • image_to_array
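A pipeline built from the functions above could look like the following sketch in keras for R; the directory path and image size are placeholders, not the project's actual layout.

```r
library(keras)

# Rescale pixel values from [0, 255] to [0, 1]
datagen <- image_data_generator(rescale = 1 / 255)

# Stream images in batches from a class-per-subdirectory folder
train_gen <- flow_images_from_directory(
  "data/train",               # hypothetical path
  generator   = datagen,
  target_size = c(150, 150),  # resize every image on the fly
  batch_size  = 32,
  class_mode  = "categorical"
)
```

The generator yields batches on demand, so the 101.99 GB data set never has to fit in memory at once.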

Kaggle’s Data Set: 101.99 GB

Setting Up Pipelines

Experiment 1: Classification/Retrieval Using CNN

Experiment 2 & 3: Sample Data

Samples with Landmark ID 20409

Samples with Landmark ID 83144

Samples with Landmark ID 113209

Samples with Landmark ID 126637

Experiment 2: Binary/Multi-Class Classifiers (CNN)

The sample set was split into a training set (50%), validation set (25%), and a test set (25%):

  • training set — 3,433 images
  • validation set — 1,716 images
  • test set — 1,716 images
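The reported counts can be checked with a few lines of base R; the ceiling/floor rounding here is an assumption chosen to reproduce them, since 50% of 6,865 is not a whole number.

```r
n_total <- 3433 + 1716 + 1716        # total sampled images, from the counts above
n_train <- ceiling(n_total * 0.50)   # 50% training split
n_val   <- floor(n_total * 0.25)     # 25% validation split
n_test  <- n_total - n_train - n_val # remainder goes to test
c(train = n_train, validation = n_val, test = n_test)
```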

Both classifiers used the same network architecture, except for the output-layer activation function:

  • Binary Image Classifier — 2 labels, sigmoid function for the output layer
  • Multi-Class Image Classifier — 4 labels, softmax function for the output layer
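The contrast between the two output activations can be shown with plain-R definitions (the example logits are illustrative):

```r
# Sigmoid squashes one logit into a single probability (binary case);
# softmax turns a logit vector into a probability distribution (multi-class case)
sigmoid <- function(x) 1 / (1 + exp(-x))
softmax <- function(x) exp(x) / sum(exp(x))

z <- c(2.0, -1.0, 0.5, 0.1)  # example logits for the 4 labels
p <- softmax(z)
sum(p)        # softmax probabilities always sum to 1
sigmoid(2.0)  # a single probability for the binary classifier
```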

After several rounds of trial and error with hyperparameter tuning, three models produced good results:

  • Model 1: rmsprop optimizer, learning rate 0.0001
  • Model 2: rmsprop optimizer, learning rate 0.0001, data augmentation, a dropout layer at rate 0.5
  • Model 3: rmsprop optimizer, learning rate 0.01, data augmentation, a dropout layer at rate 0.5
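A model like Model 2 could be compiled as in the following keras-for-R sketch; only the optimizer, learning rate, and dropout rate come from the list above, while the layer stack itself is illustrative.

```r
library(keras)

model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu",
                input_shape = c(150, 150, 3)) %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_flatten() %>%
  layer_dropout(rate = 0.5) %>%                  # dropout layer at rate 0.5
  layer_dense(units = 4, activation = "softmax") # 4 labels, softmax output

model %>% compile(
  optimizer = optimizer_rmsprop(lr = 1e-4),      # rmsprop, learning rate 0.0001
  loss      = "categorical_crossentropy",
  metrics   = "accuracy"
)
```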

Data Augmentation
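A typical augmentation configuration with image_data_generator looks like the sketch below; the specific ranges are illustrative, not necessarily the ones used in the project.

```r
library(keras)

# Random transforms applied to each training image on the fly,
# so the network never sees the exact same picture twice
augment <- image_data_generator(
  rescale            = 1 / 255,
  rotation_range     = 40,    # random rotations up to 40 degrees
  width_shift_range  = 0.2,   # random horizontal shifts
  height_shift_range = 0.2,   # random vertical shifts
  shear_range        = 0.2,
  zoom_range         = 0.2,
  horizontal_flip    = TRUE
)
```

Augmentation is applied only to the training generator; validation and test images are just rescaled.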

CNN Model Accuracy Comparison

Model  | Accuracy (4 labels) | Accuracy (2 labels)
-------|---------------------|--------------------
CNN 1  | 91.0%               | 97.6%
CNN 2  | 89.8%               | 97.7%
CNN 3  | 93.1%               | 97.3%

CNN: Visualizing Intermediate Activations

What does the model see? Let’s look at an example.

Activation Maps: 8 Intermediate Layers
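Activations for the first 8 layers can be extracted with a multi-output keras model, as in this sketch (following the approach in Deep Learning with R); `model` is the trained CNN and `img` is assumed to be a 1 × height × width × 3 image array.

```r
library(keras)

# Build a model that returns the outputs of the first 8 layers
layer_outputs    <- lapply(model$layers[1:8], function(l) l$output)
activation_model <- keras_model(inputs = model$input, outputs = layer_outputs)

# One forward pass yields a list of 8 feature-map arrays
activations <- activation_model %>% predict(img)

# activations[[1]] holds the first layer's feature maps;
# each channel can be plotted as a grayscale image
```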

Experiment 3: Traditional Classifiers

Decision Tree

Decision Tree

Random Forest

k-Nearest Neighbors

SVM
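The traditional baselines all follow the same fit/predict pattern in R. This minimal sketch shows two of them (rpart for the decision tree, class::knn for k-nearest neighbors) on the built-in iris data rather than the landmark images; the randomForest and e1071 (SVM) packages work analogously.

```r
library(rpart)
library(class)

set.seed(42)
idx   <- sample(nrow(iris), 100)          # 100 random training rows
train <- iris[idx, ]
test  <- iris[-idx, ]

# Decision tree
tree      <- rpart(Species ~ ., data = train, method = "class")
tree_pred <- predict(tree, test, type = "class")

# k-nearest neighbors with k = 5, as in the 4-label result below
knn_pred <- knn(train[, 1:4], test[, 1:4], cl = train$Species, k = 5)

mean(tree_pred == test$Species)  # tree accuracy on held-out rows
mean(knn_pred == test$Species)   # kNN accuracy on held-out rows
```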

Accuracy Comparison for All Models

Model         | Accuracy (4 labels) | Accuracy (2 labels)
--------------|---------------------|--------------------
Decision Tree | 53.71%              | 70%
Random Forest | 63.54%              | 83.70%
kNN           | 56.73% (k = 5)      | 80.34% (k = 17)
SVM (radial)  | 61.43%              | 78.86%
CNN           | 93.1%               | 97.7%

Scale-Invariant Feature Transform (SIFT)

Summary

References

  1. Google Landmark Retrieval 2020. https://www.kaggle.com/c/landmark-retrieval-2020
  2. T. Weyand, A. Araujo, B. Cao, and J. Sim. Google Landmarks Dataset v2: A Large-Scale Benchmark for Instance-Level Recognition and Retrieval. In Proc. CVPR 2020. https://arxiv.org/abs/2004.01804
  3. TensorFlow for R. https://tensorflow.rstudio.com/tutorials/beginners/basic-ml/tutorial_basic_classification
  4. Kaggle tutorials. https://www.kaggle.com/learn/deep-learning
  5. François Chollet and J. J. Allaire. Deep Learning with R. Manning. https://www.manning.com/books/deep-learning-with-r